Detailed Action



Drawings 

Objections 

1. The drawings are objected to because of the following.

a. The handwritten captions depicted in Fig. 1 (e.g. in reference numbers 121 and 125) are 
illegible and/or are such that they cannot be adequately reproduced. These should be replaced 
with typed captions to ensure visual clarity and proper reproduction. 

b. The handwritten captions depicted in Fig. 4 (e.g. in step S17) are illegible and/or are such that
they cannot be adequately reproduced. These should be replaced with typed captions to ensure 
visual clarity and proper reproduction. 

c. The handwritten captions depicted in Fig. 11 are illegible and/or are such that they cannot be
adequately reproduced. These should be replaced with typed captions to ensure visual clarity 
and proper reproduction. 

d. The handwritten captions depicted in Fig. 13 (e.g. in reference numbers 1301-1302) are 
illegible and/or are such that they cannot be adequately reproduced. These should be replaced 
with typed captions to ensure visual clarity and proper reproduction. 

e. The handwritten captions depicted in Fig. 20 (e.g. in reference numbers 1301-1302) are 
illegible and/or are such that they cannot be adequately reproduced. These should be replaced 
with typed captions to ensure visual clarity and proper reproduction. 

f. The handwritten captions depicted in Fig. 26 (e.g. in reference numbers 1301-1302) are
illegible and/or are such that they cannot be adequately reproduced. These should be replaced
with typed captions to ensure visual clarity and proper reproduction.

g. The pattern depicted in Figure 10 does not correlate to the corresponding discussion found in
the last paragraph of page 23 of the Applicant's disclosure; or, its intended meaning cannot be
discerned from its deficient description in the aforementioned part of the Applicant's
disclosure. For instance, Fig. 10 appears to depict the newly coded pattern corresponding to
an arbitrary section (notice the dimensions of the section are different) of the patterns depicted
in Figs. 6, 8 and 9. It cannot be determined from the Applicant's disclosure what section of
the original pattern(s) this corresponds to or how it was obtained in the first place. See the
discussion below relating to the Applicant's specification.
A proposed drawing correction or corrected drawings are required in reply to the Office action to avoid 
abandonment of the application. The objection to the drawings will not be held in abeyance. 



Specification 

Objections 

2. Title of the Invention. The title of the invention is not descriptive. A new title is required that is clearly 
indicative of the invention to which the claims are directed. 

3. The disclosure is objected to because of the following informalities: 

a. On page 21, paragraph 2 (last sentence) of the Applicant's disclosure, the reference number 
305 is used to indicate a filter and an image pickup apparatus. The reference numbers 
attributed to these items should be changed to correspond to Fig. 3. 

b. On page 22, last paragraph to page 23, first paragraph, the Applicant refers to steps 1001-
1007 of Fig. 4. These reference numbers are not shown in Fig. 4. These reference numbers
should be changed to reflect what is shown in Fig. 4 or the reference numbers of Fig. 4 should 
be changed to correspond to this part of the disclosure. 

c. On page 41, paragraph 4 of the Applicant's disclosure, the Applicant refers to an OHP
(overhead projector) 2805. Fig. 28 shows OHP 2804. The specification should be corrected to 
indicate this. 



Application/Control Number: 09/892,884 Page 4 

Art Unit: 2623 

d. The pattern depicted in Figure 10 does not correlate to the corresponding discussion found in
the last paragraph of page 23 of the Applicant's disclosure; or, its intended meaning cannot be
discerned from its deficient description in the aforementioned part of the Applicant's
disclosure. For instance, Fig. 10 appears to depict the newly coded pattern corresponding to
an arbitrary section (notice the dimensions of the section are different) of the patterns depicted
in Figs. 6, 8 and 9. It cannot be determined from the Applicant's disclosure what section of
the original pattern(s) this corresponds to or how it was obtained in the first place. If Fig. 10 is
correct, then the Applicant should provide an adequate explanation of how to arrive at the
pattern depicted in Fig. 10 (without introducing new matter) and what section of the patterns
depicted in Figs. 6, 8, and 9 Fig. 10 corresponds to.
Appropriate correction is required. 

4. According to 37 C.F.R. § 1.73, the Summary of the Invention should be a "brief summary of the invention
indicating its nature and substance, which may include a statement of the object of the invention, should precede the
detailed description". The Summary of the Invention supplied by the Applicant can hardly be considered brief. The
Applicant has chosen to merely repeat the Claims, instead of providing a summation of the invention and its
advantages. As such, the submitted Summary is unnecessarily repetitive, restating the proposed
inventive features as in the claims. The Applicant is advised to succinctly summarize the invention, its various
embodiments and the objects of the invention.

Claims 

Objections

5. Claims 6 and 14 are objected to because of the following informalities. On lines 2-4 of claim 6, the 
Applicant refers to "light of an invisible region". This terminology can be misleading. For example, light of an 
invisible region could be interpreted as meaning light emanating from an occluded region of the observed scene. 
Other misinterpretations are possible. It seems from the Applicant's disclosure that claim 6 was meant to refer to light
of an invisible region of the electromagnetic spectrum. This should be indicated explicitly in claim 6. Claim 14
should be changed similarly. Appropriate correction is required.



Rejections Under 35 U.S.C. § 112(2)

6. The following is a quotation of the second paragraph of 35 U.S.C. 112: 

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject 
matter which the applicant regards as his invention. 

7. Claims 7, 15, 22, 27, 32 and 37-40 are rejected under 35 U.S.C. 112, second paragraph, as being indefinite 
for failing to particularly point out and distinctly claim the subject matter which applicant regards as the invention. 

8. The following is in regard to Claims 17, 27, and 37-40. Claims 17, 27, and 37-40 recite the limitation
"intensity image acquired in advance". There is insufficient antecedent basis for this limitation in the claims.

9. The following is in regard to Claims 7 and 15. These claims recite the limitation "creating second range 
information by bringing the area into correspondence with intensity information obtained by the first and second 
image pickup parts". It is not clear, even in light of the specification, what is meant by bringing the area into 
correspondence. While this could mean aligning the area with intensity information obtained by the first and second
image pickup parts, such alignment is not disclosed. However, the Applicant does derive correspondences between 
the "intensity information obtained by the first and second image pickup parts" (last paragraph on page 24 of the 
Applicant's disclosure). Therefore, for the remainder of this document it will be assumed that, in claims 7 and 15, 
"creating second range information by bringing the area into correspondence with intensity information obtained by 
the first and second image pickup parts" means "creating second range information by deriving a correspondence 
between intensity information obtained by the first and second image pickup parts". 

10. The following is in regard to Claims 22 and 32. In these claims, it is unclear what is meant by transforming the
extracted character data to character data replaceable as a code value. The Applicant's disclosure does not resolve
this ambiguity. Transforming the extracted character data to character data replaceable as a code value will be 
treated, in this document, as transforming extracted handwritten characters or text (e.g. by handwriting or optical 
character recognition) to standard character codes such as ASCII, or the like. 

Rejections Under 35 U.S.C. § 102(b)

11. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the
rejections under this section made in this Office action:

(b) the invention was patented or described in a printed publication in this or a foreign country or in public use or on sale in this 
country, more than one year prior to the date of application for patent in the United States. 

12. Claims 1, 8, 9, and 16 are rejected under 35 U.S.C. 102(b) as being anticipated by Kang et al. ("A 
Multibaseline Stereo System with Active Illumination and Real-Time Image Acquisition", IEEE, 1995). 

13. The following is in regard to Claim 9. Kang et al. disclose a depth reconstruction system utilizing multiple
cameras and the projection of temporally coded structured light (Kang et al. Abstract). The method includes the 
following: 

(9.a.) Four cameras (i.e. a first, second, third, and fourth image pickup part) pick up stereo
intensity images of a scene illuminated by an active illumination (i.e. a "projection 
pattern" of light having sinusoidally varying intensity). The cameras have disjoint optical 
axes that are separated by some fixed baseline distance. See paragraph 1 of Section 2 on 
page 88 of Kang et al. 

(9.b.) Recovering depth (range) data from analysis of the stereo images (i.e. "creating first
range information based on the pattern picked up by the second image pickup part"). See
the Abstract of Kang et al.

(9.c.) Performing a geometric transformation (e.g. application of the rectified homographies
K1, K2, and K3 - Kang et al. Fig. 5 and section 4.3) for the intensity image produced by
the cameras based on the range information. Note that the rectified homographies, being
2D affine transformations (Kang et al. page 90, right column, paragraph 4), are geometric
transformations.
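
For illustration only, the kind of geometric transformation referred to in step (9.c.) can be sketched as the application of a 3x3 matrix to image coordinates. This is a minimal sketch and is not taken from Kang et al.; the function name `apply_homography` and the values in `H_example` are hypothetical and chosen only to show the mechanics.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 matrix H using homogeneous coordinates."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under H")
    # Divide by the homogeneous coordinate to return to image coordinates.
    return (xh / w, yh / w)


# Hypothetical affine example (last row 0 0 1): scale x by 2, shift y by 1.
H_example = [
    [2.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0],
]
```

When the last row of the matrix is (0, 0, 1), as in the example, the mapping is affine and the division by w has no effect.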

It has thus been shown that the depth recovery method of Kang et al. sufficiently conforms to the image processing
method proposed by the Applicant in claim 9. Therefore, the teachings of Kang et al. anticipate the method of claim 9.

14. The following is in regard to Claim 16. As shown above, Kang et al. disclose a depth recovery method that
adequately satisfies the limitations of claim 9. The cameras used in the method of Kang et al. have non-parallel
optical axes. See, for example, Kang et al. Fig. 3 and the description of Kang et al.'s verged camera
configuration in Section 2.1. Therefore, the cameras (including the second camera or image pickup part) capture the
measurement target at different angles. Depth (range) information, according to the method of Kang et al., is derived
from the images of the scene having active illumination projected onto it. Taking this into account, the method of
Kang et al. is such that the cameras (the image pickup parts), including the second camera, comprise plural image
pickup parts that pick up the measurement target at different angles. The method also involves creating range
information based on projection patterns respectively picked up by the plural image pickup parts of the cameras,
including the second camera. This method is, therefore, in accordance with that which is proposed in claim 16.

15. The following is in regard to Claim 1. Claim 1 recites substantially the same limitations as claim 9 (the
claimed apparatus merely being a physical manifestation of the method proposed in claim 9). Therefore, with regard
to claim 1, remarks analogous to those presented above relating to claim 9 are applicable.

16. The following is in regard to Claim 8. As shown above, Kang et al. adequately address the limitations of
claim 1. Claim 8 recites substantially the same limitations as claim 16 (the claimed apparatus merely being a physical
manifestation of the method proposed in claim 16). Therefore, with regard to claim 8, remarks analogous to those
presented above relating to claim 16 are applicable.



Rejections Under 35 U.S.C. § 103(a)

17. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set 
forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if 
the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would 
have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 



18. Claims 2-4, 7, 10-12 and 15 are rejected under 35 U.S.C. 103(a) as being unpatentable over Kang et al. in
view of Batlle et al. ("Recent Progress in Coded Structured Light as a Technique to Solve the Correspondence
Problem: A Survey", Pattern Recognition, 1998).

19. The following is in regard to Claim 10. As shown above, Kang et al. disclose a depth recovery method that
adequately satisfies the limitations of claim 9. Kang et al., however, do not expressly show or suggest that the
range information be derived by:

(10.a.) For an area where the amount of change of the pattern picked up by the first image
pickup part with respect to the projection pattern is equal to or greater than a
predetermined value.

(10.b.) Assigning new code corresponding to the pattern picked up by the first image pickup
part.

(10.c.) Creating the first range information from the pattern picked up by the second image
pickup part based on the new code.
Despite this, these aspects follow implicitly from Kang et al.'s usage of structured light, as shown below.

20. Batlle et al. present a survey of various coded structured light methods. With regard to claim 10, reference
will be made here to the technique suggested by Posdamer and Altshuler and discussed in Section 6.1 (page 969) of
Batlle et al. and Sato and Inokuchi's later extension of that methodology (Batlle et al. page 970). The technique
includes the projection of a temporally varying binary pattern onto the scene (Batlle et al. Fig. 3). In this way,
sections of the patterns are temporally encoded (Batlle et al. Fig. 4). Sato et al. propose the usage of a dynamic
threshold indicative of the difference between the observed pattern intensity and the reference (projected) pattern
intensity (Batlle et al. page 970, last paragraph of left column). Taking this into account, Batlle et al. thus teach a
coded structured light method that includes:

(10.a'.) For an area where the amount of change of the pattern picked up by an image pickup
part with respect to the projection pattern is equal to or greater than a predetermined
value (Batlle et al. page 970, last paragraph of left column).

(10.b'.) Assigning new code corresponding to the pattern picked up by the image pickup part
(e.g. by the temporal binary codification discussed above).

(10.c'.) Creating the first range information from the pattern picked up by the image pickup part
based on the new code (Batlle et al. Abstract).
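
By way of illustration, temporal binary codification of the kind discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation described by Batlle et al.: the function name `decode_temporal_codes`, the frame layout, and the use of a per-pixel threshold image as a stand-in for a dynamic threshold are all hypothetical.

```python
def decode_temporal_codes(frames, thresholds):
    """Build one integer codeword per pixel from a sequence of captured frames.

    frames: list of 2D lists of intensities, one per projected binary pattern,
            ordered from most significant bit to least significant bit.
    thresholds: 2D list of per-pixel thresholds (hypothetical stand-in for a
            dynamic threshold; a pixel at or above its threshold reads as 1).
    """
    rows, cols = len(thresholds), len(thresholds[0])
    codes = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            code = 0
            for frame in frames:
                bit = 1 if frame[r][c] >= thresholds[r][c] else 0
                code = (code << 1) | bit  # append the next bit of the codeword
            codes[r][c] = code
    return codes
```

With n projected patterns, each pixel receives one of up to 2^n codewords, which identifies the projected pattern segment that illuminated it.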

21. The teachings of Batlle et al. and Kang et al. are combinable because they are analogous art. Specifically,
Kang et al. describe a method that uses structured light principles in a passive stereo system, yet omit the details of
the structured light employed. Batlle et al., on the other hand, discuss, in detail, various structured light methods,
including one that involves steps (10.a'.)-(10.c'.) above. Therefore, it would have been obvious to one of ordinary
skill in the art, at the time of the applicant's claimed invention, to utilize the coded structured light methodology
discussed above and by Batlle et al. to codify the projected pattern used in Kang et al.'s reconstruction method. The
motivation to do so would have been to distinguish pixels or groups of pixels in the captured images by determining
which beam column (or other coded pattern segment) has been projected onto the scene (Batlle et al. Section 6.1,
paragraph 3 (last sentence)). This increases the local discriminability at each pixel or pixel group, thereby allowing
the acquisition of accurate and dense depth data (Kang et al. page 88, right column, sentence 1). Using such a coded
structured light scheme in the depth recovery method of Kang et al. yields a method where range data is derived by:

(10.a.) For an area where the amount of change of the pattern picked up by the four cameras
(including the first camera) with respect to the projection pattern is equal to or greater
than a predetermined value.

(10.b.) Assigning new code corresponding to the pattern picked up by the cameras (including the
first camera).

(10.c.) Creating the first range information from the pattern picked up by the cameras (including
the second camera) based on the new code.
Such a method is in accordance with the image processing method proposed by the Applicant in claim 10.

22. The following is in regard to Claim 11. As shown above, Kang et al. disclose a depth recovery method that
adequately satisfies the limitations of claim 9. The method of Kang et al., being a stereo-vision method, further
includes making comparison between captured frame data images (Kang et al. Section 4.1). By incorporating the
coded structured light of Batlle et al.'s teachings as discussed above, these captured frame data images would be
acquired in a time-series (recall the temporal encoding of the projected pattern discussed above). Therefore, the
method obtained by combining the teachings of Kang et al. and Batlle et al., in the manner discussed above, would
include: making comparison between frame data images picked up in a time-series.¹ Furthermore, as indicated by
Batlle et al. (e.g. Batlle et al. page 978, right column, lines 1-3), the presence of noise obfuscates the derivation of
the correspondence between the images captured from the cameras. Therefore, in order to facilitate the derivation of
a proper correspondence and, thereby, eliminate potential depth discontinuities, it would have been obvious to one
of ordinary skill in the art, at the time of the applicant's claimed invention, to eliminate noise data from the frame
data image based on a result of the comparison between the frame data images. Therefore, in this manner, the
method obtained by combining the teachings of Kang et al. and Batlle et al., in the manner discussed above, conforms
to the method proposed by the Applicant in claim 11.

23. The following is in regard to Claim 12. As shown above, Kang et al. disclose a depth recovery method that
adequately satisfies the limitations of claim 9. As discussed above, the teachings of Kang et al. and Batlle et al. can
be combined¹ to obtain a method wherein a comparison is made between frame data images picked up in a
time-series. Furthermore, Kang et al. teach a rectification process (Kang et al. page 89, Section 4, paragraphs 1-2) that
involves modifying a position of the frame data image based on a result of the comparison between the frame data
images. Therefore, in this manner, the method obtained by combining the teachings of Kang et al. and Batlle et al., in
the manner discussed above, conforms to the method proposed by the Applicant in claim 12.

24. The following is in regard to Claim 15. As shown above, Kang et al. disclose a depth recovery method that
adequately satisfies the limitations of claim 9. In the method of Kang et al., as in most methods utilizing stereo
vision, correspondences are derived between captured stereo images, in processes known as stereo matching. See,
for example, Kang et al. Abstract, Introduction (paragraph 1), and paragraph 1 of Section 4.1, in conjunction with
Figs. 3-4. Note also that such correspondences are embodied in the various homographies derived according to
the method of Kang et al. (paragraph 3 of Kang et al. Section 4). These homographies provide a bijective map
between points, and hence all areas, of adjacent images. These homographies are derived between all adjacent
cameras (including the first and second). Kang et al., however, do not show determining an amount of change of the
pattern captured by a camera with respect to the projection pattern and determining whether or not it is less than a
predetermined value.



1 The motivation to incorporate this coded structured light method into the method of Kang et al. was discussed above with regard to claim 10. 

25. As discussed above (see the discussion above with respect to steps (10.a)-(10.c)), Batlle et al. teach
a coded structured light method that includes: Evaluating whether an area is such that the amount of change of the
pattern picked up by an image pickup part with respect to the projection pattern is equal to or greater than a
predetermined value (Batlle et al. page 970, last paragraph of left column). Consequently, the determination of
whether or not the amount of change is less than the predetermined value is made implicitly for areas deemed not to
satisfy this criterion.

26. The teachings of Batlle et al. and Kang et al. are combinable because they are analogous art. Specifically,
Kang et al. describe a method that uses structured light principles in a passive stereo system, yet omit the details of
the structured light employed. Batlle et al., on the other hand, discuss, in detail, various structured light methods.
Therefore, it would have been obvious to one of ordinary skill in the art, at the time of the applicant's claimed
invention, to evaluate whether or not an area is such that the amount of change of the pattern picked up by an
image pickup part with respect to the projection pattern is less than a predetermined value. As indicated by Batlle et
al. (Batlle et al. page 970, last paragraph of left column), this has the advantage of distinguishing between high
contrast transitions that are part of the observed scene itself, as opposed to high contrast transitions that occur as a
result of the projected pattern. Combining the teachings of Batlle et al. and Kang et al. yields a method of depth
recovery that includes:

(15.a.) Evaluating whether or not an area of a captured image is such that the amount of change
of the pattern picked up by an image pickup part with respect to the projection
pattern is less than a predetermined value.

(15.b.) Deriving correspondences between all adjacent stereo images (including those obtained
from the first and the second cameras).

Since, in such a method, correspondences are derived for all pixels in the captured images (including those having 
an amount of change less than the predetermined value), the method obtained as such conforms sufficiently to that 
of claim 15. 
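
As an illustrative aid only, the threshold test recited in steps (10.a.) and (15.a.) can be sketched as a partition of image areas by their measured amount of change. This is a minimal sketch and is not drawn from any cited reference or from the Applicant's disclosure; the function name `partition_areas`, the area identifiers, and the numeric scale of the change values are hypothetical.

```python
def partition_areas(change_by_area, threshold):
    """Split areas by the amount of change relative to the projection pattern.

    change_by_area: dict mapping a (hypothetical) area identifier to its
                    measured amount of change.
    Returns (recode, stereo): areas at or above the threshold, which would be
    assigned new code, and areas below it, which would be handled by
    correspondence between the stereo images.
    """
    recode, stereo = [], []
    for area_id, change in change_by_area.items():
        if change >= threshold:
            recode.append(area_id)
        else:
            stereo.append(area_id)
    return recode, stereo
```

Because every area falls into exactly one of the two lists, the "less than" determination is made implicitly whenever the "equal to or greater than" test fails.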

27. The following is in regard to Claims 2-4 and 7. As shown above, Kang et al. adequately address the
limitations of claim 1. Claims 2-4 and 7 recite substantially the same limitations as claims 10-12 and 15, respectively.
(The claimed apparatuses merely implement the corresponding methods of claims 10-12 and 15). Therefore, with
regard to claims 2-4 and 7, remarks analogous to those presented above relating to claims 10-12 and 15 are
respectively applicable.

28. Claims 6 and 14 are rejected under 35 U.S.C. 103(a) as being unpatentable over Kang et al., in view of 
Mack et al. (U.S. Patent 6,377,700). 

29. The following is in regard to Claim 14. As shown above, Kang et al. disclose a depth recovery method that
adequately satisfies the limitations of claim 9. Though Kang et al. show the capture of the intensity image of the
scene having an active illumination projected onto it (i.e. the pattern projection image and intensity image picked up
in parallel - see above), Kang et al. do not expressly show or suggest casting a pattern light formed by
invisible-region light such as infrared or ultraviolet light.

30. As noted above, the active illumination acts as a structured light mechanism that facilitates the local
discrimination of corresponding pixels among the captured images. Mack et al. disclose a 3D reconstruction
method that uses structured light projected onto the scene or object under observation (e.g. Mack et al. Abstract and
Field of Invention). Mack et al. suggest the usage of infrared light patterns as the structured light projected onto the
scene. See, for example, Mack et al. column 5, lines 15-21 and lines 49-58.

31. The teachings of Mack et al. and Kang et al. are combinable because they are analogous art. In particular,
both Kang et al. and Mack et al. disclose stereoscopic techniques of depth recovery or 3D
reconstruction that employ structured lighting. Therefore, given the teachings of Mack et al., it would have been
obvious to one of ordinary skill in the art, at the time of the applicant's claimed invention, to use an active
illumination, or structured lighting, that is formed of patterns of infrared illumination in the depth recovery method
of Kang et al. The motivation for doing so would have been to project patterns onto the scene that are invisible and
less offensive to the human eye. The method thus obtained would conform to the image processing method proposed
by the Applicant in claim 14.

32. The following is in regard to Claim 6. As shown above, Kang et al. adequately address the limitations of
claim 1. Despite its awkward wording (see discussion above relating to the objection of this claim), claim 6 recites
substantially the same limitations as claim 14. (The claimed apparatus merely implements the corresponding method
of claim 14). Therefore, with regard to claim 6, remarks analogous to those presented above relating to claim 14 are
applicable.

33. Claims 5 and 13 are rejected under 35 U.S.C. 103(a) as being unpatentable over Kang et al., in view of 
Perona et al. (U.S. Patent 6,377,700). 

34. The following is in regard to Claim 13. As shown above, Kang et al. adequately address the limitations of
claim 1. Kang et al., however, do not expressly show or suggest: 

(13.a.) Storing in a storage part an initial image of frame data picked up in a time-series.
(13.b.) Making comparison between frame data images picked up in a time-series.
(13.c.) Extracting only differential data as storage data based on a result of the comparison
between the initial frame data and frame data subsequently picked up.

35. Perona et al. disclose a method and apparatus for tracking handwriting by detecting the movement of a
writing implement relative to a writing surface. Perona et al. suggest a stereo-vision
approach to determining the 3D position (particularly depth information) of the writing implement during the
tracking (Perona et al. column 7, lines 66-67 to column 8, lines 1-6 and 23-28). To capture the movements of the
writing, Perona et al. capture multiple frames of images (stereo pairs, if stereo is used) and extract the difference
between the adjacent frames. See Perona et al. Fig. 4, steps 400-402 and column 5, lines 30-35. This clearly
addresses steps (13.a.)-(13.c.).
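
For illustration, steps (13.a.)-(13.c.) can be sketched as a comparison between a stored initial frame and a subsequently captured frame, retaining only the pixels that differ. This is a minimal sketch and is not the implementation of Perona et al. or of the Applicant; the function name `extract_differences`, the `tolerance` parameter, and the dictionary output format are hypothetical.

```python
def extract_differences(initial_frame, later_frame, tolerance=0):
    """Return only the differential data between two frames.

    initial_frame, later_frame: 2D lists of intensities of equal size.
    tolerance: hypothetical allowance for small intensity fluctuations.
    Returns a dict {(row, col): new_value} for pixels whose intensity
    differs from the stored initial frame by more than the tolerance.
    """
    diff = {}
    for r, (row_a, row_b) in enumerate(zip(initial_frame, later_frame)):
        for c, (a, b) in enumerate(zip(row_a, row_b)):
            if abs(a - b) > tolerance:
                diff[(r, c)] = b  # keep only what changed
    return diff
```

Storing only the returned entries, rather than each full frame, corresponds to extracting "only differential data as storage data".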

36. The teachings of Perona et al. and Kang et al. are combinable because they are analogous art. Perona et al.
suggest the usage of stereo-vision techniques to acquire 3D information of the writing surface, while Kang et al.
acquire 3D information of an observed scene via stereo-vision techniques. Therefore, it would have been obvious
to one of ordinary skill in the art, at the time of the applicant's claimed invention, to extend the stereo-vision depth
recovery methodology of Kang et al. to the handwriting tracking method of Perona et al. The motivation to do so
would have been to exploit the accuracy of Kang et al.'s method of depth recovery (Kang et al. Abstract) to detect and
track the 3D position of a writing implement, according to Perona et al.'s method. Combining the
teachings of Kang et al. and Perona et al., in this manner, produces a method that sufficiently conforms to that of
claim 13.

37. The following is in regard to Claim 5. Claim 5 recites substantially the same limitations as claim 13. (The 
claimed apparatus merely implements the corresponding method of claim 13). Therefore, with regard to claim 5,
remarks analogous to those presented above relating to claim 13 are applicable. 

38. Claims 27-30 are rejected under 35 U.S.C. 103(a) as being unpatentable over Saund et al. (U.S. Patent
5,764,383)², in view of Perona et al.

39. The following is in regard to Claim 27. Saund et al. disclose a platenless book scanning system and method
that uses structured light to obtain a 3D representation of the observed book (e.g. Saund et al. (925)
Abstract). The method of Saund et al. includes the following:

(27.a.) Projecting light to an image holding medium (e.g. platform 8 of Saund et al. (383) or
Saund et al. (925) Fig. 1) to form an image thereon. See, for example, Saund et al. (925)
column 6, lines 19-35.

(27.b.) Capturing the image (e.g. image I1 - Saund et al. (925) column 7, lines 49-50) projected
on the image holding medium. See, for example, Saund et al. (925) column 6, lines 6-18.

(27.c.) Acquiring an intensity image (e.g. image I2 - Saund et al. (925) column 7, lines 53-54)
based on the image picked up in the image pickup step. See, for example, Saund et al.
(925) column 6, lines 6-18.

(27.d.) Acquiring range information from the picked-up image (e.g. Saund et al. (925) column 6
(lines 19-21), column 7 (lines 16-21) and Figs. 8-9).

(27.e.) Performing geometric transformation (e.g. skew correction via the page shape transform)
for the intensity image based on the range information acquired in the range information
acquisition step. Refer to Saund et al. (383) Fig. 7 and column 7, lines 63-67 to column 8,
lines 1-8.

2 Notice that, in column 8, lines 31-37 of U.S. Patent 5,764,383, Saund et al. incorporate by reference U.S. Patent Application 09/657,711 (now U.S. Patent
5,760,925 issued to the same inventors and assignee). In accordance with MPEP § 2163.07(b), all material disclosed therein will be treated as part of U.S. Patent
5,764,383. Consequently, when referring to Saund et al. in this document, reference will actually be made to both U.S. Patent 5,764,383 and U.S. Patent
5,760,925. For the sake of clarity, any specific references to either U.S. Patent 5,764,383 or U.S. Patent 5,760,925 will be denoted as Saund et al. (383) or Saund
et al. (925), respectively.

Though Saund et al. suggest the subtraction - i.e. the extraction of the difference - of intensity images (e.g. image
data 32 (I2) and page shape data (I1) - Saund et al. (925) column 15, lines 24-32), Saund et al. do not expressly
show or suggest (27.f.) extracting difference between the geometric-transformed intensity image and the intensity
image acquired in advance.

40. Perona et al. disclose a method and apparatus for tracking handwriting by detecting the movement of a
writing implement relative to a writing surface. Perona et al. suggest a stereo-vision
approach to determining the 3D position (particularly depth information) of the writing implement during the
tracking (Perona et al. column 7, lines 66-67 to column 8, lines 1-6 and 23-28). To capture the movements of the
writing, Perona et al. capture multiple frames of images (stereo pairs, if stereo is used) and extract the difference
between the adjacent frames. See Perona et al. Fig. 4, steps 400-402 and column 5, lines 30-35.
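The extraction of a difference between adjacent frames can be sketched as follows. This is an illustrative example only, not the procedure of Perona et al.: it assumes grayscale frames stored as lists of rows of integer intensities, and the function name `frame_difference` and the `threshold` value are hypothetical.

```python
def frame_difference(prev_frame, curr_frame, threshold=30):
    """Return a binary mask marking pixels that changed between two frames.

    Assumption: both frames are equally sized lists of rows of integer
    intensities; a pixel counts as changed when its absolute intensity
    difference exceeds the threshold.
    """
    if len(prev_frame) != len(curr_frame):
        raise ValueError("frames must have the same height")
    mask = []
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        if len(prev_row) != len(curr_row):
            raise ValueError("frames must have the same width")
        mask.append(
            [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        )
    return mask
```

In a tracking setting, the nonzero region of the mask would correspond to newly added strokes or to the moving writing implement; small intensity fluctuations below the threshold are ignored.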

41. The teachings of Perona et al. and Saund et al. are combinable because they are analogous art. First, note the
structural similarities of the systems of Perona et al. and Saund et al., depicted respectively in Perona et al. and
Saund et al. Fig. 1. Second, Perona et al. suggest the usage of stereo-vision techniques to acquire 3D information
of the writing surface. As is well-known, structured light methods, such as that of Saund et al., are often
interchangeable with stereo-vision methods as a means to obtain 3D information of an observed scene. Therefore, it
would have been obvious to one of ordinary skill in the art, at the time of the applicant's claimed invention, to
employ the structured light method of Saund et al. (steps (27.a.)-(27.e.) above) to obtain a skew-corrected 3D
representation of a dynamic writing surface³ in a writing tracking method, such as that of Perona et al. One
motivation to extend the scanning method of Saund et al. so as to accommodate the tracking of handwriting would
have been to construct a "digital desktop". Perona et al. suggest this in column 2, lines 36-43 and lines 49-53.
Incorporating the teachings of Perona et al. into the method of Saund et al. yields a method including steps (27.a)-
(27.e) above, as well as (27.f): extracting difference between the geometric-transformed intensity image⁴ (e.g. Fio of

3 The writing surface (i.e. the book) being scanned in Saund et al.'s method is static, in the sense that the text is fixed, as opposed to being a dynamic writing
surface having handwriting added to it.

4 Presumably each frame, in such a combined method, would undergo the skew correction proposed by Saund et al., if necessary. 
Perona et al. Fig. 4) and the intensity image acquired in advance (e.g. Fi of Perona et al. Fig. 4). That is, the method
thus obtained conforms sufficiently to the image processing method proposed by the Applicant in claim 27. 

42. The following is in regard to Claim 28. As shown above, the teachings of Saund et al. and Perona et al., 
when combined in the manner discussed above, adequately address all subject matter set forth by the Applicant in 
claim 27. Perona et al. suggest that the image holding medium, or writing surface, be a whiteboard or manuscript 
sheet (Perona et al. column 3, lines 52-57). Saund et al. also suggest that the subject of the scanning may be a 
manuscript sheet (e.g. the scanned pages of book 10 depicted in Saund et al. (925) Fig. 1). In this way, the method 
obtained by combining the teachings of Saund et al. and Perona et al. in the manner discussed above, conforms 
sufficiently to the image processing method of claim 28.

43. The following is in regard to Claim 29. As shown above, the teachings of Saund et al. and Perona et al.,
when combined in the manner discussed above, adequately address all subject matter set forth by the Applicant in
claim 27. As discussed above with regard to claim 27, the intensity image acquired in advance as a processing target
in the image extracting step is a preceding frame image inputted precedent to the geometric transformation step. In
fact, this limitation follows directly from the language of claim 27 ("...geometric-transformed intensity image and
the intensity image acquired in advance"). In this way, the method obtained by combining the teachings of Saund et 
al. and Perona et al. in the manner discussed above, conforms sufficiently to the image processing method of claim 
29. 

44. The following is in regard to Claim 30. As shown above, the teachings of Saund et al. and Perona et al.,
when combined in the manner discussed above, adequately address all subject matter set forth by the Applicant in
claim 27. Clearly, if one desires to process an image, or any data for that matter, acquired at some prior time, then
the associated data must be stored in some means of storage, at the time of its acquisition (i.e. in advance of the
processing). Therefore, the feature proposed in claim 30 would be inherent to the method obtained by combining the
teachings of Saund et al. and Perona et al. in the manner discussed above.



45. Claims 17-24, 26-27, 31-34, and 36-40 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Saund et al. (U.S. Patent 5,764,383), in view of Stolfo (U.S. Patent 5,668,897). 
46. The following is in regard to Claims 27 and 31. Saund et al. disclose a platenless book scanning system
and method that uses structured light to obtain a 3D representation of the observed book (e.g. Saund et
al. (925) Abstract). The method of Saund et al. includes the following:

(27.a.) Projecting light to an image holding medium (e.g. platform 8 of Saund et al. (383) or
Saund et al. (925) Fig. 1) to form an image thereon. See, for example, Saund et al. (925)
column 6, lines 19-35.
(27.b.) Capturing the image (e.g. image I1 - Saund et al. (925) column 7, lines 49-50) projected
on the image holding medium. See, for example, Saund et al. (925) column 6, lines 6-18.
(27.c.) Acquiring an intensity image (e.g. image I2 - Saund et al. (925) column 7, lines 53-54)
based on the image picked up in the image pickup step. See, for example, Saund et al.
(925) column 6, lines 6-18.
(27.d.) Acquiring range information from the picked-up image (e.g. Saund et al. (925) column 6
(lines 19-21), column 7 (lines 16-21) and Figs. 8-9).
(27.e.) Performing geometric transformation (e.g. skew correction via the page shape transform)
for the intensity image based on the range information acquired in the range information
acquisition step. Refer to Saund et al. (383) Fig. 7 and column 7, lines 63-67 to column 8,
lines 1-8.

Saund et al., however, do not expressly show or suggest: 

(31.a.) Storing plural pieces of document format data in a document database.
(31.b.) Performing matching between a geometric-transformed intensity image and the pieces of
document format data stored in the document database.
(31.c.) Extracting a difference between the geometric-transformed intensity image and the
pieces of document format data stored in the document database⁵.



5 Note that step (31.c) essentially consists of (27.f) extracting difference between the geometric-transformed intensity image and the intensity image acquired in
advance, where the latter intensity image would correspond to the document format data.
47. Many so-called "form drop-out" techniques perform these steps. For example, Stolfo discloses a
methodology for extracting or separating handwritten text from standardized documents, such as various types of 
checks (Stolfo, Summary of Invention, paragraph 1 and column 21, lines 5-11). According to this methodology: 

(31.a'.) Plural pieces of document format data⁶ are stored in a document database. This is
suggested in Stolfo column 7, lines 23-25.
(31.b'.) A scanned (intensity) image is matched to the pieces of document format data stored in
the document database (Stolfo column 7, lines 8-22).
(31.c'.) A subtraction (i.e. extraction of the difference) is performed between the intensity image
and the pieces of document format data stored in the document database (Stolfo column
7, lines 19-22).
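The three "form drop-out" steps listed above (store templates, match, subtract) can be sketched as follows. This is a simplified illustration and not Stolfo's algorithm: it assumes already aligned binary images (1 = ink, 0 = background) of equal size, a plain dictionary as the document database, and a hypothetical overlap score; the names `match_score`, `best_template` and `drop_out` are likewise hypothetical.

```python
def match_score(scan, template):
    """Fraction of the template's ink pixels that are also ink in the scan."""
    ink = hits = 0
    for scan_row, tmpl_row in zip(scan, template):
        for s, t in zip(scan_row, tmpl_row):
            if t:
                ink += 1
                hits += 1 if s else 0
    return hits / ink if ink else 0.0


def best_template(scan, database):
    """Name of the stored template that best matches the scanned image."""
    return max(database, key=lambda name: match_score(scan, database[name]))


def drop_out(scan, template):
    """Keep only ink present in the scan but absent from the template."""
    return [
        [1 if s and not t else 0 for s, t in zip(scan_row, tmpl_row)]
        for scan_row, tmpl_row in zip(scan, template)
    ]


# Hypothetical document database holding two blank-form templates.
database = {
    "check": [[1, 1, 1], [0, 0, 0], [1, 0, 0]],
    "invoice": [[1, 0, 0], [1, 0, 0], [1, 0, 0]],
}
# A filled-in "check": the template plus two handwritten pixels in row 1.
scan = [[1, 1, 1], [0, 1, 1], [1, 0, 0]]
name = best_template(scan, database)
handwriting = drop_out(scan, database[name])
```

The subtraction only isolates the handwriting if the scan is registered to the template, which is why skew correction of the scanned image matters before the matching and subtraction steps.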

48. The teachings of Saund et al. and Stolfo are combinable because they are analogous art. Specifically, Saund
et al. demonstrate a method for scanning a document that eliminates skew. Stolfo, on the other hand, discusses a
method of handwriting extraction that calls for document scanning. In particular, Stolfo's method calls for
scanned documents to be, among other things, skew-corrected (e.g. Stolfo column 14, lines 62-66 and column 3,
lines 29-43). Therefore, it would have been obvious to one of ordinary skill in the art, at the time of the applicant's
claimed invention, to apply the platenless book scanning methodology of Saund et al. to a handwriting extraction
method such as that of Stolfo, thereby generating a geometrically-transformed intensity image to process.
The motivation for doing so would have been to provide handwriting extraction from standardized documents that
have been skew-corrected. Combining these methods in this manner yields a handwriting extraction method having
steps (27.a)-(27.e) above as well as the steps of:

(31.a.) Storing plural pieces of document format data in a document database.
(31.b.) Performing matching between a geometric-transformed intensity image and the pieces of
document format data stored in the document database.
(31.c.) Extracting a difference between the geometric-transformed intensity image and the
pieces of document format data stored in the document database.



6 The (somewhat misleading) term document format data is taken in this case to mean a standard document image or template. This is consistent with the
Applicant's usage of the term.



Application/Control Number: 09/892,884 
Art Unit: 2623 



Page 19 



The method thus obtained adequately satisfies the limitations of claims 27 and 31.



49. The following is in regard to Claim 32. As shown above, the teachings of Saund et al. and Stolfo can be
combined so as to yield a method that sufficiently conforms to the image processing method of claim 27. Stolfo
suggests the usage of handwriting recognition or optical character recognition on the handwriting from the
standardized documents (e.g. Stolfo column 4, lines 58-63). This sufficiently addresses the limitations of claim 32.
Therefore, the method obtained by combining the teachings of Stolfo and Saund et al., in the manner discussed
above, conforms substantially to the image processing method of claim 32.

50. The following is in regard to Claims 33-34. As shown above, the teachings of Saund et al. and Stolfo can
be combined so as to yield a method that sufficiently conforms to the image processing method of claim 27.
According to Stolfo (Stolfo column 26, lines 66-67 to column 27, lines 1-6), the extracted handwriting, specifically
the check signature, can be authenticated against a database of signatures (i.e. a "handwriting history data of
registered users in a authentication information database"). Clearly, performing matching between the input image
and the handwriting history data stored in the authentication information database would be inherent to such an
authentication process. Furthermore, given the discussion above, it can be reasonably assumed that the geometric-
transformed intensity image serves as input. Therefore, the method obtained by combining the teachings of Stolfo
and Saund et al., in the manner discussed above, conforms substantially to the image processing method of claim 33. In
addition, since the authentication taught by Stolfo is of the signature extracted from the scanned document, the
method thus obtained also adequately satisfies the limitations of claim 34.

51. The following is in regard to Claim 36. As shown above, the teachings of Saund et al. and Stolfo can be
combined so as to yield a method that sufficiently conforms to the image processing method of claim 27. It should
be (overwhelmingly) apparent that the range information would be stored after being derived. Furthermore, it should
be apparent from Saund et al. (e.g. Saund et al. (383) Fig. 1) that the camera is in a fixed position relative to the
document, or more particularly, the document holder⁷. In this way, the method, obtained by combining the teachings
of Stolfo and Saund et al. as discussed above, conforms substantially to the image processing method of claim 36.



7 If it were not, the various transforms (in particular the perspective transform) derived according to Saund et al. would be invalid, or the entire
system would have to be constantly recalibrated. One of ordinary skill in the art would easily recognize this.
52. The following is in regard to Claims 17-24 and 26. As shown above, the teachings of Saund et al. and Stolfo,
when combined in the manner discussed above, adequately address the limitations of claim 17. Claims 17-24 and 26
recite substantially the same limitations as claims 27-34 and 36, respectively. (The claimed apparatuses merely
implement the corresponding methods of claims 27-34 and 36). Therefore, with regard to claims 17-24 and 26,
remarks analogous to those presented above relating to claims 27-34 and 36 are respectively applicable.

53. The following is in regard to Claim 37. Claim 37 recites substantially the same limitations as claim 27.
(The claimed storage medium contains a program that merely implements the corresponding method of claim 27).
Therefore, with regard to claim 37, remarks analogous to those presented above relating to claim 27 are applicable.

54. The following is in regard to Claims 38-40. Note that since capturing an image of a scene (the image
holding medium) having projected light cast upon it is essentially the same as picking up the projected light, claims 
38-40 do not introduce anything substantively different from what is claimed in claims 17, 27, and 37, respectively. 
Therefore, with regard to claims 38-40, remarks analogous to those presented above relating to claims 17, 27, and 37 
are respectively applicable. 



55. Claims 25 and 35 are rejected under 35 U.S.C. 103(a) as being unpatentable over Saund et al., in view of
Stolfo, in further view of Wellner (U.S. Patent 5,511,148).

56. The following is in regard to Claim 35. As shown above, the teachings of Saund et al. and Stolfo can be
combined to satisfy the limitations of claim 27. However, neither Saund et al. nor Stolfo expressly show or suggest
displaying an image produced as a result of performing geometric transformation for the intensity image based on 
the range information. 

57. Wellner discloses an "interactive digital desktop", that is, a system for generating new documents from
originals containing text and/or images employing e.g. a camera-projector system focused on a work surface
(Wellner Abstract). In Wellner's system (depicted in Wellner Fig. 1) a video projector 8 is mounted adjacent the
camera 6 and projects onto the surface 2 a display 21 which is generally coincident with the field of view of the
camera 6, and which, in the example shown, includes an image of a newly created document 20. This projected
document image is skew-corrected (Wellner column 10, lines 61-62). In this way, Wellner teaches displaying an
image produced as a result of performing geometric transformation for the intensity image.

58. The teachings of Wellner are combinable with those of Saund et al. and Stolfo because they are analogous
art. Specifically, Wellner's teachings are related to both platenless scanning (see Wellner Fig. 1) and text extraction
and recognition (Wellner column 9, lines 60-63). Therefore, it would have been obvious to one of ordinary skill in
the art, at the time of the applicant's claimed invention, to incorporate the projection or display functionality of 
Wellner's digital desktop into the method obtained by combining the teachings of Stolfo and Saund et al., as 
discussed above, wherein skew-correction is performed according to the method proposed by Saund et al. The 
motivation for doing so would have been to provide the user with visual feedback relating to the alignment of the 
document. Incorporating this display functionality in this way yields a method, in accordance with claim 27, further 
configured to display an image produced as a result of performing geometric transformation for the intensity image 
based on the range information. A method thus obtained sufficiently conforms to the image processing method 
proposed by the Applicant in claim 35. 

59. The following is in regard to Claim 25. Claim 25 recites substantially the same limitations as claim 35.
(The claimed apparatus merely implements the method of claim 35). Therefore, with regard to claim 25, remarks 
analogous to those presented above relating to claim 35 are applicable. 



Citation of Relevant Prior Art 

60. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: 
The following are systems and/or methods that combine or otherwise utilize both passive (e.g. stereo-pairs) and 
active (e.g. structured light) stereo vision techniques: 

[1] U.S. Patent 6,028,672. Geng. Publication Date: Feb. 2000
[2] U.S. Patent 6,750,873. Bernardini et al. Publication Date: Jul. 2004
[3] Passive and Active Stereo Vision for Smooth Surface Detection of Deformed Plates. IEEE,
1995. Chen et al.
[5] U.S. Patent 6,674,893. Abe et al. Publication Date: Jan. 2004
[6] U.S. Patent 6,356,298. Abe et al. Publication Date: Mar. 2002
[7] U.S. Patent 4,687,326. Corby, Jr. Publication Date: Aug. 1987
[8] Spacetime Stereo: Shape Recovery for Dynamic Scenes. IEEE, 2003. Zhang et al.
The following references deal with the extraction and recognition of text via 3D imaging of a document or scene: 
[9] U.S. Patent 5,581,637. Cass et al. Publication Date: Dec. 1996 
[10] U.S. Patent 6,563,948. Tan et al. Publication Date: May 2003
[11] Passive and Active Stereo Vision for Smooth Surface Detection of Deformed Plates. IEEE,
1995. Chen et al.

The following reference detects and removes regions of an image corresponding to fingers and the like: 

[12] U.S. Patent 6,256,411. Iida. Publication Date: July 2001
The following references deal with extraction of text or handwriting from images, and more generally, the process
known as "form drop-out": 

[13] U.S. Patent 6,023,534. Handley. Publication Date: Feb. 2000
[14] U.S. Patent 6,701,013. Charpentier. Publication Date: Mar. 2004
[15] U.S. Patent 6,320,983. Matsuno et al. Publication Date: Nov. 20
[16] U.S. Patent 5,201,011. Bloomberg et al. Publication Date: Apr. 1993
[17] A System to Read Names and Addresses on Tax Forms. IEEE, 1996. Sargur et al.
[18] User-Defined Template for Identifying Document Type and Extracting Information from
Documents. IEEE, 1999. Kochi et al.
[19] A Generic System to Extract and Clean Handwritten Data From Business Forms.
Proceedings of the Workshop on Frontiers in Handwriting Recognition, September 2000. Ye
et al.

The following references are related to "interactive digital desktops" and related collaborative environment systems: 
[20] Interacting with Paper on the Digital Desk. ACM, 1993. Wellner.
[21] Interactive Design of Seamless Collaboration Media. ACM, 1994. Ishii et al.
[22] Something From Nothing: Augmenting a Paper-based Work Practice via Multimodal
Interaction. ACM, 2000. McGee et al.
[23] VideoDraw: A Video Interface for Collaborative Drawing. ACM, 1991. Tang et al.
[24] Adaptive Annotation Using a Human-Robot Interface System PARTNER. IEEE, 2001.
Yamashita et al.